16 research outputs found

    Use of routinely collected national data sets for reporting on induced abortion in Australia

    Foreword: The lack of national data on induced abortion in Australia represents a gap in health statistics. The AIHW’s Reproductive Health Indicators in Australia 2002 report included an indicator on induced abortions in Australia, but national data were not reported for it because data on induced abortion were not available on a routine basis Australia-wide. This report comprehensively assesses the extent to which different forms of routinely collected data can be used to quantify the incidence of induced abortion in Australia. The innovative use of data combined from hospital and non-hospital sources helps to provide a more complete picture of reproductive health in Australia, as well as providing a basis for regular reporting in the future. The compilation of the data contained in this document represents the best effort to date to provide a factual database on the incidence of induced abortion. The report does not include any analysis of the legal, social or moral issues often raised in discussion of abortion.

    Classification of Cancer-related Death Certificates using Machine Learning

    Background: Cancer monitoring and prevention rely on the timely notification of cancer cases. However, abstracting and classifying cancer from the free text of pathology reports and other relevant documents, such as death certificates, is complex and time-consuming. Aims: This paper investigates approaches for automatically detecting notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer-related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the most significant influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with an SVM classifi
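The keyword-spotter baseline and the F-measure used for evaluation can be illustrated with a minimal sketch. The keyword list, the toy certificates and the function names below are assumptions made for illustration; the paper derives its keywords from the long descriptions of ICD-10 cancer-related codes, which are not reproduced here.

```python
import re

# Hypothetical keyword list (assumption); the paper extracts its keywords
# from the long descriptions of ICD-10 cancer-related codes.
CANCER_KEYWORDS = {"carcinoma", "melanoma", "lymphoma", "leukaemia", "neoplasm"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def spot_cancer(text, keywords=CANCER_KEYWORDS):
    """Flag a certificate as cancer-related if any keyword occurs in it."""
    return any(token in keywords for token in tokenize(text))


def f_measure(predicted, actual):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy certificates (invented for illustration, not taken from the paper).
certificates = [
    "Metastatic carcinoma of the lung",
    "Acute myocardial infarction",
    "Malignant melanoma",
    "Pneumonia; history of lymphoma",  # cancer mentioned, not cause of death
]
labels = [True, False, True, False]

predictions = [spot_cancer(c) for c in certificates]
score = f_measure(predictions, labels)
```

On this toy data the last certificate is a false positive, which hints at why a plain keyword match may fall short of a trained classifier: it cannot tell a cancer mentioned in passing from a cancer listed as the cause of death.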

    Cancer in Australia 1999

    Cancer in Australia 1999 presents comprehensive national data on cancer incidence and mortality and summary data on screening, survival, inpatient hospital and general practice episodes, risk factors, and the cancer workforce. The report provides 1999 data for cancer by site, age and sex, and summary data for each State and Territory. Incidence and mortality trends since the early 1980s and age patterns for selected cancers are features of this report. Cancer in Australia 1999 is an important reference from the Cancer Series for all those interested in the health of Australians.

    Australia's health 2000 : the seventh biennial report of the Australian Institute of Health and Welfare

    Australia's Health 2000 is the seventh biennial health report of the Australian Institute of Health and Welfare. It is the nation's authoritative source of information on patterns of health and illness, determinants of health, the supply and use of health services, and health services costs and performance. This 2000 edition serves as a summary of Australia's health record at the end of the twentieth century. In addition, a special chapter is presented on changes in Australia's disease profile over the last 100 years. Australia's Health 2000 is an essential reference and information source for all Australians with an interest in health.

    Australia's health 2002 : the eighth biennial report of the Australian Institute of Health and Welfare

    Australia's Health 2002 is the eighth biennial health report of the Australian Institute of Health and Welfare. It is the nation's authoritative source of information on patterns of health and illness, determinants of health, the supply and use of health services, and health service costs and performance. Australia's Health 2002 is an essential reference and information resource for all Australians with an interest in health.

    Trading in food safety? The impact of trade agreements on quarantine in Australia

    Australia's food safety and quarantine standards are coming under increasing pressure from bilateral trade agreements such as the Australia-US FTA. Despite the risks to Australian health and agriculture, quarantine standards have already been lowered to accommodate increased trade. Further concessions are likely in negotiations for FTAs with China, Thailand and other countries. Hilary Bambrick argues that trade agreements should be used as a mechanism to raise world standards for food safety and quarantine rather than lowering standards to the lowest common denominator. A summary of her report is available via the link below.

    Automatic de-identification of electronic health records: an Australian perspective

    We present an approach to automatically de-identify health records. In our approach, personal health information is identified using a Conditional Random Fields machine learning classifier, a large set of linguistic and lexical features, and pattern matching techniques. Identified personal information is then removed from the reports. The de-identification of personal health information is fundamental for the sharing and secondary use of electronic health records, for example for data mining and disease monitoring. The effectiveness of our approach is first evaluated on the 2007 i2b2 Shared Task dataset, a widely adopted dataset for evaluating de-identification techniques. Subsequently, we investigate the robustness of the approach to limited training data; we study its effectiveness on data of different type and quality by evaluating the approach on scanned pathology reports from an Australian institution. This data contains optical character recognition errors, as well as linguistic conventions that differ from those contained in the i2b2 dataset, for example different date formats. The findings suggest that our approach is comparable to the best approach from the 2007 i2b2 Shared Task; in addition, the approach is found to be robust to variations of training size, data type and quality in the presence of sufficient training data.
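The pattern-matching component mentioned above can be pictured with a small sketch that masks day-first numeric dates. The regular expression, the placeholder and the function name are assumptions chosen for illustration; the abstract does not list the patterns the authors used, and the Conditional Random Fields classifier is not shown.

```python
import re

# Assumed pattern: day-first numeric dates such as 03/11/2009 or 5.11.09.
# The paper's actual patterns are not given in the abstract.
DATE_PATTERN = re.compile(r"\b\d{1,2}[/.-]\d{1,2}[/.-](?:\d{4}|\d{2})\b")


def mask_dates(text, placeholder="[DATE]"):
    """Replace every matched date with a placeholder token."""
    return DATE_PATTERN.sub(placeholder, text)


masked = mask_dates("Specimen received 03/11/2009, reported 5.11.09.")
```

A pattern like this only covers well-formed dates; dates damaged by character recognition errors would probably need the learned classifier or more tolerant patterns.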

    Automatic ICD-10 classification of cancers from free-text death certificates

    Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
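The cascaded architecture described above can be sketched as a gate followed by a set of per-type scorers. In this sketch the trained SVMs are replaced by simple keyword scorers, and picking the highest-scoring type is an assumed decision rule; the helper names and keyword choices are hypothetical and not taken from the paper.

```python
def keyword_score(*keywords):
    """Build a stand-in scorer (assumption): counts keyword occurrences.

    In the paper each scorer would be a trained SVM; this placeholder only
    keeps the example self-contained.
    """
    def score(text):
        lowered = text.lower()
        return float(sum(lowered.count(k) for k in keywords))
    return score


def cascade_classify(text, is_cancer, type_classifiers):
    """Two-level cascade: level 1 gates level 2.

    Returns None when level 1 finds no cancer, otherwise the ICD-10 code
    whose level-2 scorer gives the highest score (assumed decision rule).
    """
    if not is_cancer(text):
        return None
    scores = {code: clf(text) for code, clf in type_classifiers.items()}
    return max(scores, key=scores.get)


# Level 1: binary cancer / no cancer (stand-in for the first-level SVM).
_cancer_score = keyword_score("carcinoma", "cancer", "malignant")


def is_cancer(text):
    return _cancer_score(text) > 0


# Level 2: one scorer per cancer type, keyed by ICD-10 category.
type_classifiers = {
    "C34": keyword_score("lung", "bronch"),  # bronchus and lung
    "C50": keyword_score("breast"),          # breast
}

lung = cascade_classify("Metastatic carcinoma of the lung", is_cancer, type_classifiers)
cardiac = cascade_classify("Acute myocardial infarction", is_cancer, type_classifiers)
```

One consequence of this design is that an error at the first level cannot be recovered at the second, so the binary gate needs high recall.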

    Extracting cancer mortality statistics from death certificates: a hybrid machine learning and rule-based approach for common and rare cancers

    Objective: Death certificates are an invaluable source of cancer mortality statistics. However, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and variable quality of certificates written in natural language. This paper proposes an automatic classification system for identifying all cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. The features were used as input to two different classification sub-systems: a machine learning sub-system using Support Vector Machines (SVMs) and a rule-based sub-system. A fusion sub-system then combines the results from SVMs and rules into a single final classification. A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. Results: The system was highly effective at determining the type of cancers for both common cancers (F-measure of 0.85) and rare cancers (F-measure of 0.7). In general, rules outperformed SVMs; however, the fusion method that combined the two was the most effective. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
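The abstract does not say how the fusion sub-system combines the two outputs. One plausible strategy, shown here purely as an assumption, is to trust a rule whenever one fires and otherwise fall back to the SVM prediction if its confidence clears a threshold. The function name, parameters and threshold value are hypothetical.

```python
def fuse(rule_label, svm_label, svm_confidence, threshold=0.5):
    """Combine rule-based and SVM outputs into one label (assumed strategy).

    rule_label: ICD-10 code from the rule sub-system, or None if no rule fired.
    svm_label: ICD-10 code predicted by the SVM sub-system.
    svm_confidence: confidence attached to svm_label, assumed to lie in [0, 1].
    """
    if rule_label is not None:
        return rule_label          # rules take priority when they fire
    if svm_confidence >= threshold:
        return svm_label           # otherwise accept a confident SVM label
    return None                    # no decision


rule_wins = fuse("C34", "C50", 0.9)
svm_fallback = fuse(None, "C50", 0.9)
undecided = fuse(None, "C50", 0.2)
```

Giving rules priority mirrors the reported finding that rules generally outperformed SVMs, but the actual fusion method in the paper may differ.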

    The impact of OCR accuracy on automated cancer classification of pathology reports

    Objective: To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Method: Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human-amended version of the OCR reports. Results: The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to have minimal impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. Conclusions: The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of free-text pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
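Character and word accuracy figures like those above are commonly derived from the edit distance between the OCR output and a corrected reference. The abstract does not state the exact formula used, so the definition below (one minus edit distance divided by reference length) is an assumption, as are the function names and the sample strings.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - edit_distance / len(ref), floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


reference = "malignant melanoma"
ocr_output = "malignant rnelanoma"  # invented example: "m" misread as "rn"

char_accuracy = accuracy(reference, ocr_output)
word_accuracy = accuracy(reference.split(), ocr_output.split())
```

In this example a two-character error costs one whole word, which illustrates why word accuracy tends to be lower than character accuracy on the same text.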